A Stochastic Language Model using Dependency and Its Improvement by Word Clustering
Abstract
In this paper, we present a stochastic language model for Japanese using dependency. The prediction unit in this model is an attribute of a "bunsetsu", represented by the product of the head of its content words and that of its function words. The relations between bunsetsu attributes are governed by a context-free grammar. The word sequences are predicted from the attribute using a word n-gram model. The spelling of an unknown word is predicted using a character n-gram model. The model is robust in that it can compute the probability of an arbitrary string, and complete in that it models everything from unknown words to dependency at the same time.
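To make the robustness claim concrete, the following is a minimal sketch of a character n-gram model that assigns a nonzero probability to any spelling, including words never seen in training. It is not the paper's implementation: the bigram order, the add-one smoothing, the boundary symbols, and the names `CharBigramModel`, `alphabet_size`, and `log_prob` are all assumptions made for illustration.

```python
import math
from collections import Counter

# Hypothetical boundary symbols marking the start and end of a word.
BOS, EOS = "<s>", "</s>"


class CharBigramModel:
    """Character bigram model with add-one smoothing (an assumed choice)."""

    def __init__(self, words, alphabet_size):
        # alphabet_size: assumed number of distinct symbols that can follow
        # a context (all characters plus the end-of-word symbol).
        self.alphabet_size = alphabet_size
        self.bigrams = Counter()
        self.contexts = Counter()
        for word in words:
            chars = [BOS] + list(word) + [EOS]
            for prev, cur in zip(chars, chars[1:]):
                self.bigrams[(prev, cur)] += 1
                self.contexts[prev] += 1

    def log_prob(self, word):
        """Return the log probability of the spelling of `word`."""
        chars = [BOS] + list(word) + [EOS]
        total = 0.0
        for prev, cur in zip(chars, chars[1:]):
            # Add-one smoothing keeps every transition probability above zero.
            numerator = self.bigrams[(prev, cur)] + 1
            denominator = self.contexts[prev] + self.alphabet_size
            total += math.log(numerator / denominator)
        return total
```

Because every character transition keeps a positive smoothed probability, `log_prob` returns a finite value for an arbitrary string, which is the property that lets a model of this kind score unknown words.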
Similar resources
An improved joint model: POS tagging and dependency parsing
Dependency parsing is a form of syntactic parsing in natural language processing that automatically analyzes the dependency structure of sentences, producing a dependency graph for each input sentence. Part-of-speech (POS) tagging is a prerequisite for dependency parsing. Generally, dependency parsers perform POS tagging and dependency parsing in a pipeline. Unfortunately, in pipel...
Feature Engineering in Persian Dependency Parser
A dependency parser is one of the most important fundamental tools in natural language processing. It extracts the structure of sentences and determines the relations between words based on dependency grammar. Dependency parsing is well suited to free-word-order languages such as Persian. In this paper, a data-driven dependency parser has been developed with the help of a phrase-structure parser fo...
Language modeling by stochastic dependency grammar for Japanese speech recognition
This paper describes a language modeling technique using a kind of stochastic context-free grammar (stochastic dependency grammar, SDG). In this work, two improvements are made to the general CFG-based SCFG model. The first improvement is to use a restricted grammar instead of a general CFG. The dependency grammar used here is a restricted CFG that expresses modification between two words or phras...
Adaptive Hybrid POS Cache based Semantic Language Model
This paper presents a language model that improves on the stochastic language model by building a syntactic structure based on word dependencies in local and non-local domains. The model copes with the limited amount of training material and exploits the linguistic constraints of the language. The proposed model is a dynamic probabilistic model which uses word depen...
A Stochastic Parser Based on a Structural Word Prediction Model
In this paper, we present a stochastic language model using dependency. This model considers a sentence as a word sequence and predicts each word from left to right. The history at each step of prediction is a sequence of partial parse trees covering the preceding words. First our model predicts the partial parse trees which have a dependency relation with the next word among them and then pr...
Publication date: 1998